A survey on semi-Markov decision processes
Similar resources
Semi-Markov Decision Processes
Infinite-horizon semi-Markov decision processes (SMDPs) with finite state and action spaces are considered. The total expected discounted reward and long-run average expected reward optimality criteria are reviewed, and a solution methodology is given for each criterion. Constraints and variance sensitivity are also discussed.
Semi-Markov Decision Processes
The previous chapter dealt with the discrete-time Markov decision model. In this model, decisions can be made only at fixed epochs t = 0, 1, .... However, in many stochastic control problems the times between the decision epochs are not constant but random. A possible tool for analysing such problems is the semi-Markov decision model. In Section 7.1 we discuss the basic elements of this model...
Towards Analysis of Semi-Markov Decision Processes
We investigate semi-Markov decision processes (SMDPs). Two problems are studied: the time-bounded reachability problem and the long-run average fraction of time problem. The former aims to compute the maximal (or minimal) probability of reaching a certain set of states within a given time bound. We obtain a Bellman equation to characterize the maximal time-bounded reachability probability,...
Hard Constrained Semi-Markov Decision Processes
In multiple-criteria Markov decision processes (MDPs), where multiple costs are incurred at every decision point, current methods minimise the expected primary cost criterion while constraining the expectations of the other cost criteria to some critical values. However, systems often face hard constraints, where the cost criteria should never exceed some critical values at a...
Average Cost Semi-Markov Decision Processes
The semi-Markov decision model is considered under the criterion of long-run average cost. A new criterion is considered which, for any policy, takes the limit of the expected cost incurred during the first n transitions divided by the expected length of the first n transitions. Conditions guaranteeing that an optimal stationary (nonrandomized) policy exists are then presented. It is also ...
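The ratio criterion described in the abstract above can be written out as a formula. This is only a sketch under assumed notation, since the abstract fixes no symbols: here $\pi$ is a policy, $i$ the initial state, $C_k$ the cost incurred during the $k$-th transition, and $\tau_k$ the length of the $k$-th transition.

```latex
% Assumed notation (not from the source): \pi = policy, i = initial state,
% C_k = cost incurred during the k-th transition, \tau_k = its duration.
\[
  \phi(\pi, i)
  \;=\;
  \lim_{n \to \infty}
  \frac{\mathbb{E}_{\pi}\!\left[\, \sum_{k=1}^{n} C_k \;\middle|\; X_0 = i \right]}
       {\mathbb{E}_{\pi}\!\left[\, \sum_{k=1}^{n} \tau_k \;\middle|\; X_0 = i \right]}
\]
```

In the special case where every transition has unit length, $\tau_k = 1$, the denominator equals $n$ and the expression reduces to the usual average cost per step of a discrete-time Markov decision process.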
Journal
Journal title: SCIENTIA SINICA Mathematica
Year: 2015
ISSN: 1674-7216
DOI: 10.1360/n012015-00041